8 research outputs found

    A Toolbox for Functional Analysis and the Systematic Identification of Diagnostic and Prognostic Gene Expression Signatures Combining Meta-Analysis and Machine Learning

    The identification of biomarker signatures is important for cancer diagnosis and prognosis. However, the detection of clinically reliable signatures is influenced by limited data availability, which may restrict statistical power. Moreover, methods for integration of large sample cohorts and signature identification are limited. We present a step-by-step computational protocol for functional gene expression analysis and the identification of diagnostic and prognostic signatures by combining meta-analysis with machine learning and survival analysis. The novelty of the toolbox lies in its all-in-one functionality, generic design, and modularity. It is exemplified for lung cancer, including a comprehensive evaluation using different validation strategies. However, the protocol is not restricted to specific disease types and can therefore be used by a broad community. The accompanying R package vignette runs in ~1 h and describes the workflow in detail for use by researchers with limited bioinformatics training.

    Reference Interval Estimation from Mixed Distributions using Truncation Points and the Kolmogorov-Smirnov Distance (kosmic)

    Appropriate reference intervals are essential when using laboratory test results to guide medical decisions. Conventional approaches for the establishment of reference intervals rely on large samples from healthy and homogeneous reference populations. However, this approach is associated with substantial financial and logistic challenges, subject to ethical restrictions in children, and limited in older individuals due to the high prevalence of chronic morbidities and medication. We implemented an indirect method for reference interval estimation, which uses mixed physiological and abnormal test results from clinical information systems, to overcome these restrictions. The algorithm minimizes the difference between an estimated parametric distribution and a truncated part of the observed distribution, specifically, the Kolmogorov-Smirnov distance between a hypothetical Gaussian distribution and the observed distribution of test results after Box-Cox transformation. Simulations of common laboratory tests with increasing proportions of abnormal test results show reliable reference interval estimations even in challenging simulation scenarios, when <20% of test results are abnormal. Additionally, reference intervals generated using samples from a university hospital’s laboratory information system, with a gradually increasing proportion of abnormal test results, remained stable, even if samples from units with a substantial prevalence of pathologies were included. A high-performance open-source C++ implementation is available at https://gitlab.miracum.org/kosmic
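
    The cost function described in this abstract can be illustrated with a minimal Python sketch. It is not the kosmic implementation: the function names, the parameterization, and the handling of the truncation interval are assumptions made for illustration, and the search over the Box-Cox parameter, the Gaussian parameters, and the truncation points is omitted.

```python
import math
from statistics import NormalDist


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def truncated_ks_distance(values, lam, mu, sigma, t_low, t_high):
    """Kolmogorov-Smirnov distance between the observed values inside
    [t_low, t_high] (after Box-Cox transformation) and a Gaussian
    N(mu, sigma) restricted to the same interval.

    Hypothetical helper: an optimizer would call this repeatedly while
    varying lam, mu, sigma (and possibly the truncation points).
    """
    model = NormalDist(mu, sigma)
    lo, hi = box_cox(t_low, lam), box_cox(t_high, lam)
    inside = sorted(box_cox(v, lam) for v in values if t_low <= v <= t_high)
    n = len(inside)
    if n == 0:
        raise ValueError("no observations inside the truncation interval")
    mass = model.cdf(hi) - model.cdf(lo)
    if mass <= 0:
        raise ValueError("model assigns no probability to the interval")

    distance = 0.0
    for i, y in enumerate(inside):
        # Model CDF renormalized to the truncation interval.
        expected = (model.cdf(y) - model.cdf(lo)) / mass
        # Compare against the empirical CDF just before and at the step.
        distance = max(distance,
                       abs(expected - i / n),
                       abs(expected - (i + 1) / n))
    return distance
```

    Under these assumptions, a parameter set that describes the physiological part of the data yields a small distance, while a misplaced Gaussian yields a large one; the reference interval would then be read off the best-fitting Gaussian and transformed back.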

    KETOS: Clinical decision support and machine learning as a service – A training and deployment platform based on Docker, OMOP-CDM, and FHIR Web Services

    Background and objective: To take full advantage of decision support, machine learning, and patient-level prediction models, it is important that models are not only created, but also deployed in a clinical setting. The KETOS platform demonstrated in this work implements a tool for researchers allowing them to perform statistical analyses and deploy resulting models in a secure environment. Methods: The proposed system uses Docker virtualization to provide researchers with reproducible data analysis and development environments, accessible via Jupyter Notebook, to perform statistical analysis and develop, train and deploy models based on standardized input data. The platform is built in a modular fashion and interfaces with web services using the Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) standard to access patient data. In our prototypical implementation we use an OMOP common data model (OMOP-CDM) database. The architecture supports the entire research lifecycle from creating a data analysis environment, retrieving data, and training to final deployment in a hospital setting. Results: We evaluated the platform by establishing and deploying an analysis and end user application for hemoglobin reference intervals within the University Hospital Erlangen. To demonstrate the potential of the system to deploy arbitrary models, we loaded a colorectal cancer dataset into an OMOP database and built machine learning models to predict patient outcomes and made them available via a web service. We demonstrated both the integration with FHIR as well as an example end user application. Finally, we integrated the platform with the open source DataSHIELD architecture to allow for distributed privacy preserving data analysis and training across networks of hospitals. Conclusion: The KETOS platform takes a novel approach to data analysis, training and deploying decision support models in a hospital or healthcare setting. It does so in a secure and privacy-preserving manner, combining the flexibility of Docker virtualization with the advantages of standardized vocabularies, a widely applied database schema (OMOP-CDM), and a standardized way to exchange medical data (FHIR).
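
    As a rough illustration of accessing patient data through a FHIR web service, the sketch below builds a standard FHIR search URL for hemoglobin observations of one patient. It does not reflect the KETOS code base: the function name, the server address, and the patient identifier are hypothetical, and the LOINC code 718-7 (hemoglobin, mass/volume in blood) is chosen here only as an example.

```python
from urllib.parse import urlencode

# Assumption for illustration: LOINC 718-7, Hemoglobin [Mass/volume] in Blood.
LOINC_HEMOGLOBIN = "718-7"


def observation_search_url(base_url, patient_id, loinc_code=LOINC_HEMOGLOBIN):
    """Build a FHIR search URL for a patient's Observation resources.

    Uses the standard FHIR search parameters `patient` and `code`, where
    the code is given as a `system|code` token.
    """
    query = urlencode({
        "patient": patient_id,
        "code": f"http://loinc.org|{loinc_code}",
    })
    return f"{base_url.rstrip('/')}/Observation?{query}"


# Hypothetical server and patient identifier.
print(observation_search_url("https://fhir.example.org/fhir", "123"))
```

    A client would send an HTTP GET request to the resulting URL and receive a FHIR Bundle of matching observations; authentication and paging are left out of this sketch.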

    R Packages for Data Quality Assessments and Data Monitoring: A Software Scoping Review with Recommendations for Future Developments

    Data quality assessments (DQA) are necessary to ensure valid research results. Despite the growing availability of tools of relevance for DQA in the R language, a systematic comparison of their functionalities is missing. Therefore, we review R packages related to data quality (DQ) and assess their scope against a DQ framework for observational health studies. Based on a systematic search, we screened more than 140 R packages related to DQA in the Comprehensive R Archive Network. From these, we selected packages which target at least three of the four DQ dimensions (integrity, completeness, consistency, accuracy) in a reference framework. We evaluated the resulting 27 packages for general features (e.g., usability, metadata handling, output types, descriptive statistics) and the possible breadth of assessment. To facilitate comparisons, we applied all packages to a publicly available dataset from a cohort study. We found that the packages’ scope varies considerably regarding functionalities and usability. Only three packages follow a DQ concept, and some offer an extensive rule-based issue analysis. However, the reference framework does not include a few implemented functionalities, and it should be broadened accordingly. Improved use of metadata to empower DQA and user-friendliness enhancement, such as GUIs and reports that grade the severity of DQ issues, stand out as the main directions for future developments.
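
    The inclusion criterion stated in this abstract (at least three of the four DQ dimensions) can be written as a small filter. The sketch below is only an illustration of that rule: the package names and their dimension coverage are invented placeholders, not results from the review.

```python
# The four DQ dimensions named in the abstract.
DIMENSIONS = {"integrity", "completeness", "consistency", "accuracy"}


def select_packages(coverage, minimum=3):
    """Return the names of packages that target at least `minimum`
    of the four DQ dimensions. Labels outside DIMENSIONS are ignored."""
    return sorted(
        name
        for name, dims in coverage.items()
        if len(set(dims) & DIMENSIONS) >= minimum
    )


# Hypothetical package names and coverage, for illustration only.
example = {
    "pkg_a": ["integrity", "completeness", "consistency"],
    "pkg_b": ["completeness"],
    "pkg_c": ["integrity", "completeness", "consistency", "accuracy"],
}
print(select_packages(example))
```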

    Integrative Bioinformatic Analyses of Global Transcriptome Data Decipher Novel Molecular Insights into Cardiac Anti-Fibrotic Therapies

    Integrative bioinformatics is an emerging field in the big data era, offering a steadily increasing number of algorithms and analysis tools. However, for researchers in experimental life sciences it is often difficult to follow and properly apply bioinformatics methods in order to unravel the complexity and systemic effects of omics data. Here, we present an integrative bioinformatics pipeline to decipher crucial biological insights from global transcriptome profiling data to validate innovative therapeutics. It is available as a web application for an interactive and simplified analysis without the need for programming skills or a deep bioinformatics background. The approach was applied to an ex vivo cardiac model treated with natural anti-fibrotic compounds, and we obtained new mechanistic insights into their anti-fibrotic action and molecular interplay with miRNAs in cardiac fibrosis. Several gene pathways associated with proliferation, extracellular matrix processes and wound healing were altered, and we could identify microRNA (miRNA)-21-5p and miRNA-223-3p as key molecular components related to the anti-fibrotic treatment. Importantly, our pipeline is not restricted to a specific cell type or disease and can be broadly applied to better understand the unprecedented level of complexity in big data research.

    Reduced rate of inpatient hospital admissions in 18 German university hospitals during the COVID-19 lockdown

    The COVID-19 pandemic has caused strains on health systems worldwide, disrupting routine hospital services for all non-COVID patients. Within this retrospective study, we analyzed inpatient hospital admissions across 18 German university hospitals during the 2020 lockdown period compared to 2018. Patients admitted to hospital between January 1 and May 31, 2020 and the corresponding periods in 2018 and 2019 were included in this study. Data derived from electronic health records were collected and analyzed using the data integration center infrastructure implemented in the university hospitals that are part of the four consortia funded by the German Medical Informatics Initiative. Admissions were grouped and counted by ICD-10 chapters and specific reasons for treatment at each site. Pooled aggregated data were centrally analyzed with descriptive statistics to compare absolute and relative differences between time periods of different years. The results illustrate how care process adaptations depended on the COVID-19 epidemiological situation and the criticality of the disease. Overall inpatient hospital admissions decreased by 35% in weeks 1 to 4 and by 30.3% in weeks 5 to 8 after the lockdown announcement compared to 2018. Even hospital admissions for critical care conditions such as malignant cancer treatments were reduced. We also noted a high reduction of emergency admissions such as myocardial infarction (38.7%), whereas the reduction in stroke admissions was smaller (19.6%). In contrast, we observed a considerable reduction in admissions for non-critical clinical situations, such as hysterectomies for benign tumors (78.8%) and hip replacements due to arthrosis (82.4%). In summary, our study shows that the university hospital admission rates in Germany were substantially reduced following the national COVID-19 lockdown. These included critical care or emergency conditions in which deferral is expected to impair clinical outcomes. Future studies are needed to delineate how appropriate medical care of critically ill patients can be maintained during a pandemic.
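
    The absolute and relative differences mentioned in this abstract follow from simple arithmetic on admission counts. The sketch below shows that calculation under the assumption that a baseline period and an observed period are compared directly; the function name is hypothetical and the counts in the example are invented, not study data.

```python
def admission_change(baseline, observed):
    """Return the absolute difference and the relative difference (in
    percent) of admissions in the observed period versus the baseline."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    absolute = observed - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Invented counts for illustration: 200 baseline vs. 130 observed admissions.
absolute, relative = admission_change(200, 130)
print(f"{absolute} admissions, {relative:.1f}%")
```

    A negative relative difference corresponds to a reduction of that size, which is how percentages such as those reported above would be read.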